AITopics | bc loss

Collaborative filtering (CF) models easily suffer from popularity bias, which makes recommendation deviate from users' actual preferences. However, most current debiasing strategies are prone to playing a trade-off game between head and tail performance, thus inevitably degrading the overall recommendation accuracy. To reduce the negative impact of popularity bias on CF models, we incorporate Bias-aware margins into Contrastive loss and propose a simple yet effective BC Loss, where the margin tailors quantitatively to the bias degree of each user-item interaction. We investigate the geometric interpretation of BC loss, then further visualize and theoretically prove that it simultaneously learns better head and tail representations by encouraging the compactness of similar users/items and enlarging the dispersion of dissimilar users/items. Over six benchmark datasets, we use BC loss to optimize two high-performing CF models. In various evaluation settings (i.e., imbalanced/balanced, temporal split, fully-observed unbiased, tail/head test evaluations), BC loss outperforms the state-of-the-art debiasing and non-debiasing methods with remarkable improvements. Considering the theoretical guarantee and empirical success of BC loss, we advocate using it not just as a debiasing strategy, but also as a standard loss in recommender models.

collaborative filtering, contrastive loss, incorporating bias-aware margin, (8 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence (0.43)

Add feedback

334da4cbb76302f37bd2e9d86f558869-Supplemental-Conference.pdf

Neural Information Processing SystemsAug-14-2025, 04:58:15 GMT

bc loss, mean angle, representation, (16 more...)

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Personal Assistant Systems (0.69)

Add feedback

Incorporating Bias-aware Margins into Contrastive Loss for Collaborative Filtering An Zhang Wenchang Ma Xiang Wang T at-Seng Chua Sea-NExT Joint Lab National University of Singapore

Neural Information Processing SystemsAug-14-2025, 04:58:11 GMT

Xiang Wang is the corresponding author.

bc loss, recommendation, representation, (16 more...)

Neural Information Processing Systems

Country:

Asia > Singapore (0.40)
Asia > China (0.04)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Personal Assistant Systems (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback

Incorporating Bias-aware Margins into Contrastive Loss for Collaborative Filtering

Neural Information Processing SystemsOct-10-2024, 14:36:26 GMT

Collaborative filtering (CF) models easily suffer from popularity bias, which makes recommendation deviate from users' actual preferences. However, most current debiasing strategies are prone to playing a trade-off game between head and tail performance, thus inevitably degrading the overall recommendation accuracy. To reduce the negative impact of popularity bias on CF models, we incorporate Bias-aware margins into Contrastive loss and propose a simple yet effective BC Loss, where the margin tailors quantitatively to the bias degree of each user-item interaction. We investigate the geometric interpretation of BC loss, then further visualize and theoretically prove that it simultaneously learns better head and tail representations by encouraging the compactness of similar users/items and enlarging the dispersion of dissimilar users/items. Over six benchmark datasets, we use BC loss to optimize two high-performing CF models.

bc loss, contrastive loss, incorporating bias-aware margin, (5 more...)

Neural Information Processing Systems

Genre: Play > Prospect > Container > Reservoir (1.00)

Technology:

Information Technology > Communications > Social Media (0.64)
Information Technology > Artificial Intelligence > Representation & Reasoning > Personal Assistant Systems (0.64)

Add feedback

BA-SOT: Boundary-Aware Serialized Output Training for Multi-Talker ASR

Liang, Yuhao, Yu, Fan, Li, Yangze, Guo, Pengcheng, Zhang, Shiliang, Chen, Qian, Xie, Lei

arXiv.org Artificial IntelligenceOct-5-2023

The recently proposed serialized output training (SOT) simplifies multi-talker automatic speech recognition (ASR) by generating speaker transcriptions separated by a special token. However, frequent speaker changes can make speaker change prediction difficult. To address this, we propose boundary-aware serialized output training (BA-SOT), which explicitly incorporates boundary knowledge into the decoder via a speaker change detection task and boundary constraint loss. We also introduce a two-stage connectionist temporal classification (CTC) strategy that incorporates token-level SOT CTC to restore temporal context information. Besides typical character error rate (CER), we introduce utterance-dependent character error rate (UD-CER) to further measure the precision of speaker change prediction. Compared to original SOT, BA-SOT reduces CER/UD-CER by 5.1%/14.0%, and leveraging a pre-trained ASR model for BA-SOT model initialization further reduces CER/UD-CER by 8.4%/19.9%.

proc, speaker change, ud-cer, (15 more...)

arXiv.org Artificial Intelligence

2305.13716

Country: Asia > China (0.04)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Speech > Speech Recognition (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback

Working with Contrastive Losses part2(Advanced Machine Learning)

#artificialintelligenceNov-14-2022, 23:06:50 GMT

Abstract: Contrastive Learning has recently achieved state-of-the-art performance in a wide range of tasks. Many contrastive learning approaches use mined hard negatives to make batches more informative during training but these approaches are inefficient as they increase epoch length proportional to the number of mined negatives and require frequent updates of nearest neighbor indices or mining from recent batches. In this work, we provide an alternative to hard negative mining in supervised contrastive learning, Tail Batch Sampling (TBS), an efficient approximation to the batch assignment problem that upper bounds the gap between the global and training losses, LGlobal LTrain. TBS \textbf{improves state-of-the-art performance} in sentence embedding ( 0.37 Spearman) and code-search tasks ( 2.2\% MRR), is easy to implement -- requiring only a few additional lines of code, does not maintain external data structures such as nearest neighbor indices, is more computationally efficient when compared to the most minimal hard negative mining approaches, and makes no changes to the model being trained. Abstract: Collaborative filtering (CF) models easily suffer from popularity bias, which makes recommendation deviate from users' actual preferences.

advanced machine learning, bc loss, name and face, (12 more...)

#artificialintelligence

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback

Improving Learning from Demonstrations by Learning from Experience

Liu, Haofeng, Chen, Yiwen, Tan, Jiayi, Ang, Marcelo H Jr

arXiv.org Artificial IntelligenceNov-15-2021

How to make imitation learning more general when demonstrations are relatively limited has been a persistent problem in reinforcement learning (RL). Poor demonstrations lead to narrow and biased date distribution, non-Markovian human expert demonstration makes it difficult for the agent to learn, and over-reliance on sub-optimal trajectories can make it hard for the agent to improve its performance. To solve these problems we propose a new algorithm named TD3fG that can smoothly transition from learning from experts to learning from experience. Our algorithm achieves good performance in the MUJOCO environment with limited and sub-optimal demonstrations. We use behavior cloning to train the network as a reference action generator and utilize it in terms of both loss function and exploration noise. This innovation can help agents extract a priori knowledge from demonstrations while reducing the detrimental effects of the poor Markovian properties of the demonstrations. It has a better performance compared to the BC+ fine-tuning and DDPGfD approach, especially when the demonstrations are relatively limited. We call our method TD3fG meaning TD3 from a generator.

ddpgfd, demonstration, experiment, (14 more...)

arXiv.org Artificial Intelligence

2111.08156

Country: Asia > Middle East > Jordan (0.04)

Genre: Research Report (0.50)

Industry: Leisure & Entertainment > Games (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.92)

Add feedback